CS 294-1 A1: Naive Bayesian Classifier
Abstract
Settings. Our code was written in Scala and compiled with the Simple Build Tool (SBT); the programs were run on Mac OS. We test the effectiveness of our implementation from several angles. Unless stated otherwise, we adopt the following default settings. We report macro-averaged F1 measures, further averaged over ten-fold cross-validation. We consider both the Bernoulli and the Multinomial model. We use as features all words, preprocessed by stemming and stop-word elimination. Later discussions explain why these default choices were made.
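The abstract does not reproduce any code; purely as an illustration of the multinomial model with add-one (Laplace) smoothing, a minimal sketch in plain Scala collections might look like the following. The object name, the (label, tokens) training format, and the toy data are assumptions for this sketch, not the report's actual implementation.

// Minimal multinomial Naive Bayes sketch with add-one (Laplace) smoothing.
// Tokens are assumed to be already stemmed and stop-word filtered, matching
// the default setting described above. All names here are illustrative.
object NaiveBayesSketch {

  case class Model(logPriors: Map[String, Double],
                   wordLogProbs: Map[String, Map[String, Double]],
                   vocab: Set[String])

  def train(docs: Seq[(String, Seq[String])]): Model = {
    val vocab   = docs.flatMap(_._2).toSet
    val byClass = docs.groupBy(_._1)
    val priors  = byClass.map { case (c, ds) => c -> math.log(ds.size.toDouble / docs.size) }
    val logProbs = byClass.map { case (c, ds) =>
      val counts = ds.flatMap(_._2).groupBy(identity).map { case (w, ws) => w -> ws.size.toDouble }
      val total  = counts.values.sum
      // Laplace smoothing: (count + 1) / (total + |V|)
      c -> vocab.map(w => w -> math.log((counts.getOrElse(w, 0.0) + 1.0) / (total + vocab.size))).toMap
    }
    Model(priors, logProbs, vocab)
  }

  // Pick the class maximising log prior plus summed word log-likelihoods.
  def classify(m: Model, tokens: Seq[String]): String =
    m.logPriors.keys.maxBy { c =>
      m.logPriors(c) + tokens.filter(m.vocab.contains).map(m.wordLogProbs(c)).sum
    }

  def main(args: Array[String]): Unit = {
    val docs = Seq(
      ("pos", Seq("great", "film", "great", "acting")),
      ("neg", Seq("boring", "plot", "bad", "acting")))
    val model = train(docs)
    println(classify(model, Seq("great", "plot")))   // prints: pos
  }
}

A Bernoulli variant would replace the per-token counts with per-document presence indicators and also score the absent vocabulary words; only the multinomial case is sketched here.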
Similar resources
CS 294-1: Assignment 1 Naive Bayes Classification with Improvements
The main objective of this assignment was to implement a Naive Bayes classifier and attempt certain improvements upon the vanilla version. A major challenge was to implement the classifier in Scala using the two libraries scalala and scalanlp. This report presents details regarding the different experiments I tried out, namely varying the smoothing parameter, feature selection, n-gram models an...
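One of the experiments mentioned in that report is varying the smoothing parameter. As a small sketch only (the function and the printed sweep below are assumptions, not code from that report), an add-α (Lidstone) estimate, parameterised so α can be swept, looks like:

// Add-alpha (Lidstone) smoothed log-estimate of P(word | class); alpha = 1.0
// recovers Laplace smoothing. Names and numbers here are illustrative.
object SmoothingSweepSketch {
  def smoothedLogProb(count: Double, totalCount: Double,
                      vocabSize: Int, alpha: Double): Double =
    math.log((count + alpha) / (totalCount + alpha * vocabSize))

  def main(args: Array[String]): Unit = {
    // A sweep over alpha of the kind the entry describes; in a real experiment
    // each alpha would be scored by cross-validated F1 rather than printed.
    for (alpha <- Seq(0.01, 0.1, 0.5, 1.0))
      println(f"alpha=$alpha%.2f  logP=${smoothedLogProb(3.0, 100.0, 5000, alpha)}%.4f")
  }
}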
CS 294-1 Assignment 1 Report
Text classification has an increasing number of potential applications across the information world, such as recommender systems and customer service. The goal of this assignment is to apply a Naive Bayes classifier to a data set of labeled textual movie reviews and to practice Scala/ScalaNLP. The data set, “Polarity dataset v2.0”, is from http://www.cs.cornell.edu/People/pabo/movie-reviewdata/, created by...
A New Hybrid Framework for Filter based Feature Selection using Information Gain and Symmetric Uncertainty (TECHNICAL NOTE)
Feature selection is a preprocessing technique for eliminating irrelevant and redundant features, which enhances classifier performance. When a dataset contains many irrelevant and redundant features, accuracy does not improve and classifier performance degrades. To avoid this, the paper presents a new hybrid feature selection method usi...
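The two criteria named in the title are standard quantities; for reference (these are the textbook definitions, not formulas quoted from the note itself), the entropy, information gain, and symmetric uncertainty of two discrete features X and Y are:

% Textbook definitions, not quoted from the note itself.
\[
  H(X) = -\sum_{x} P(x)\,\log_2 P(x), \qquad
  IG(X;Y) = H(X) - H(X \mid Y),
\]
\[
  SU(X,Y) = \frac{2\,IG(X;Y)}{H(X) + H(Y)} \in [0,1].
\]

Symmetric uncertainty normalises information gain by the two entropies, which compensates for information gain's bias toward features with many distinct values.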
CS 6604: Data Mining
In the last lecture we discussed the relationships between different modeling paradigms such as the Bayesian approach, the Maximum A Posteriori (MAP) approach, the Maximum Likelihood (ML) approach, and the Least-squares (LS) method. In this lecture we first prove the equivalence of LS and ML under the assumption of normally distributed error. Then, the notions of the naive Bayesian classifier and the L...
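The stated LS/ML equivalence follows the standard argument; a one-step sketch, assuming i.i.d. Gaussian noise $y_i = f(x_i;\theta) + \epsilon_i$ with fixed variance $\sigma^2$ (the fixed-variance part is an assumption of this sketch), is:

\[
  \log L(\theta)
    = \sum_{i=1}^{n} \log \mathcal{N}\!\bigl(y_i \mid f(x_i;\theta), \sigma^2\bigr)
    = -\frac{n}{2}\log\bigl(2\pi\sigma^2\bigr)
      - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - f(x_i;\theta)\bigr)^2,
\]

so for fixed $\sigma^2$, maximising the likelihood over $\theta$ is exactly minimising the sum of squared residuals, i.e. the least-squares objective.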
The Indifferent Naive Bayes Classifier
The Naive Bayes classifier is a simple and accurate classifier. This paper shows that, by assuming the Naive Bayes classifier model and applying Bayesian model averaging and the principle of indifference, an equally simple, more accurate, and theoretically well-founded classifier can be obtained. In this paper we use Bayesian model averaging and the principle of indifference to derive a...